Levi Jiang lfjiang@uw.edu
Photo: The Philadelphia Inquirer
The 2025 election for New Jersey’s governor is getting intense. The New York Times described it this way: “The race features the largest field in decades. The candidates have never been more accomplished or better funded. The results could be a bellwether for midterm congressional elections.”
New Jersey is one of only two states (the other is Virginia) that hold a gubernatorial election in the year following a presidential election. For Democrats, the race could reveal whether they are winning back New Jersey voters who shifted right in the 2024 election. This makes the election an indicator of the Trump administration’s popularity and a possible preview of the 2026 midterm elections.
As the race heats up, this story examines the source of each
candidate’s campaign funds, specifically their donations. It
identifies the donors and explores whether there are
any patterns worth noting.
On June 10th, six Democrats and five Republicans will compete in the
primary for their party’s nomination, and the winners will
advance to the general election in November. The primaries are bound to
attract public attention, and this is the right time to inform readers
about the candidates’ campaign finances.
We want to help readers deepen their understanding of this
election and each candidate, motivate more people to vote, and
inform their decisions.
There has not been much data analysis of this election so far. The New York Times has published articles on who the candidates are and on their political stances: (https://www.nytimes.com/2025/06/02/nyregion/new-jersey-governors-race-what-to-know.html) (https://www.nytimes.com/interactive/2025/03/21/nyregion/new-jersey-governor-candidates-issues.html)
Politico also analyzed everyone’s advantages and disadvantages: (https://www.politico.com/interactives/2025/nj-governor-candidates-comparison-analysis/)
Ballotpedia made a map of the results of New Jersey county
Democratic Party conventions: (https://datawrapper.dwcdn.net/Ho3TT/3/)
At this point we know that 11 candidates are running for governor. Thoroughly analyzing all of them in a week would be difficult, so Rob and I settled on analyzing the top candidates in the polls. The same analysis and visualization process would apply to the remaining candidates.
According to the most recent Emerson College Polling survey, conducted in mid-May, each party has a clear front-runner. For the Democrats, Mikie Sherrill has a clear lead at the moment, followed by Steven Fulop, Ras Baraka, Josh Gottheimer and Sean Spiller in a neck-and-neck race. For Republicans, Jack Ciattarelli is far ahead. In this story, we will focus on the donations to Mikie Sherrill and Jack Ciattarelli.
The New Jersey Election Law Enforcement Commission keeps a
record of every donation to each candidate, and that is where we
acquired all the data.
Load software libraries
#install.packages("tidyverse")
#install.packages("janitor")
#install.packages("readxl")
#install.packages("rvest")
#install.packages("dplyr")
library(tidyverse)
## ── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
## ✔ dplyr 1.1.4 ✔ readr 2.1.5
## ✔ forcats 1.0.0 ✔ stringr 1.5.1
## ✔ ggplot2 3.5.2 ✔ tibble 3.2.1
## ✔ lubridate 1.9.4 ✔ tidyr 1.3.1
## ✔ purrr 1.0.4
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag() masks stats::lag()
## ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
library(janitor)
##
## Attaching package: 'janitor'
##
## The following objects are masked from 'package:stats':
##
## chisq.test, fisher.test
library(readxl)
library(rvest)
##
## Attaching package: 'rvest'
##
## The following object is masked from 'package:readr':
##
## guess_encoding
library(dplyr)
Load data
sherrill <- read.csv("../data/sherrill_mikie_Cont_638821188859939969.csv") |>
clean_names()
ciattarelli <- read.csv("../data/Ciattarelli_Jack_Cont_638845662827036224.csv") |>
clean_names()
Explore the data types
combo <- rbind(sherrill, ciattarelli)
glimpse(combo)
## Rows: 3,734
## Columns: 23
## $ is_individual <chr> "N", "N", "N", "N", "N", "N", "N", "N", "N", "N", …
## $ first_name <chr> "", "", "", "", "", "", "", "", "", "", "", "", ""…
## $ mi <chr> "", "", "", "", "", "", "", "", "", "", "", "", ""…
## $ last_name <chr> "", "", "", "", "", "", "", "", "", "", "", "", ""…
## $ suffix <chr> "", "", "", "", "", "", "", "", "", "", "", "", ""…
## $ non_ind_name <chr> "BLOOMFIELD DEMOCRATIC COMMITTEE", "MITTEN PAC", "…
## $ street <chr> "211 N 15TH STREET", "PO BOX 4145", "PO BOX 1843",…
## $ city <chr> "BLOOMFIELD", "EAST LANSING", "ALEXANDRIA", "MONTC…
## $ state <chr> "NJ", "MI", "VA", "NJ", "NJ", "DC", "DC", "DE", "N…
## $ zip <chr> "7003", "48826", "22313", "7043", "7104", "20003",…
## $ emp_name <chr> "", "", "", "", "", "", "", "", "", "", "", "", ""…
## $ emp_street <chr> "", "", "", "", "", "", "", "", "", "", "", "", ""…
## $ emp_city <chr> "", "", "", "", "", "", "", "", "", "", "", "", ""…
## $ emp_state <chr> "", "", "", "", "", "", "", "", "", "", "", "", ""…
## $ emp_zip <int> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA…
## $ occupation_name <chr> "", "", "", "", "", "", "", "", "", "", "", "", ""…
## $ contributor_type <chr> "POLITICAL PARTY CMTE", "POLITICAL CMTE", "POLITIC…
## $ contribution_type <chr> "MONETARY", "MONETARY", "MONETARY", "MONETARY", "M…
## $ contribution_date <chr> "12/31/24", "11/18/24", "11/22/24", "11/18/24", "1…
## $ contribution_amount <dbl> 500, 2500, 2500, 5800, 500, 5800, 5800, 5800, 5800…
## $ entity_name <chr> "SHERRILL, MIKIE", "SHERRILL, MIKIE", "SHERRILL, M…
## $ location <chr> "STATEWIDE", "STATEWIDE", "STATEWIDE", "STATEWIDE"…
## $ election_year <int> 2025, 2025, 2025, 2025, 2025, 2025, 2025, 2025, 20…
The data set has detailed information on every donation each
candidate received: the donor’s name, address, zip code, employer, date,
and so on. There are some errors in the data format to fix.
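One of those format issues is visible in the `glimpse()` output: New Jersey zip codes appear as "7003" rather than "07003". A likely cause is that `read.csv()` guessed the column was numeric in one of the files. The sketch below is an assumption about how to avoid that at load time, not the method this analysis uses; the temporary file and its two rows are made up for illustration.

```r
# Write a tiny illustrative CSV (hypothetical rows) to a temporary file.
csv_path <- tempfile(fileext = ".csv")
writeLines(c("city,zip", "BLOOMFIELD,07003", "EAST LANSING,48826"), csv_path)

# Default behavior: zip is guessed to be an integer, so the leading zero is lost.
default_read <- read.csv(csv_path)

# Forcing the column to character keeps the zip code exactly as written.
text_read <- read.csv(csv_path, colClasses = c(zip = "character"))

default_read$zip
text_read$zip
```

Reading the column as text from the start would make a later padding step unnecessary, although padding after the fact gives the same result for five-digit codes.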
Fix dates
combo <- combo |>
mutate(date = lubridate::mdy(contribution_date))
## Warning: There was 1 warning in `mutate()`.
## ℹ In argument: `date = lubridate::mdy(contribution_date)`.
## Caused by warning:
## ! 113 failed to parse.
sherrill <- sherrill |>
mutate(date = lubridate::mdy(contribution_date))
## Warning: There was 1 warning in `mutate()`.
## ℹ In argument: `date = lubridate::mdy(contribution_date)`.
## Caused by warning:
## ! 113 failed to parse.
ciattarelli <- ciattarelli |>
mutate(date = lubridate::mdy(contribution_date))
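The warnings above report 113 dates that failed to parse, both in the combined data and in Sherrill's file, which suggests the problem rows are all hers. Before moving on, it can help to look at what those raw values are. This is a minimal sketch on made-up rows (the `toy` data frame is hypothetical); the same filter could be pointed at `sherrill`.

```r
library(dplyr)
library(lubridate)

# Hypothetical rows: two valid dates, one blank, one malformed.
toy <- tibble::tibble(
  entity_name = c("A", "A", "B", "B"),
  contribution_date = c("12/31/24", "", "11/18/24", "not a date")
)

# quiet = TRUE suppresses the parse warning; failures simply become NA.
toy_checked <- toy |>
  mutate(date = mdy(contribution_date, quiet = TRUE))

# Keep only the rows whose date could not be parsed,
# then tally the raw strings to see what went wrong.
failed <- toy_checked |>
  filter(is.na(date))

failed |>
  count(contribution_date, sort = TRUE)
```

If most failures turn out to be blank strings, the dates are missing in the source file rather than badly formatted, which matters later when rows are sorted by date.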
After that, we can have an overall picture of the donations.
Data exploration: totals by candidate
combo |>
select(entity_name, contribution_amount) |>
group_by(entity_name) |>
summarize(total = sum(contribution_amount)) |>
arrange(desc(total))
## # A tibble: 2 × 2
## entity_name total
## <chr> <dbl>
## 1 CIATTARELLI, JACK 8862080.
## 2 SHERRILL, MIKIE 1343168.
So we can see the huge gap between the two candidates.
Ciattarelli’s reported total is about 6.6 times
Sherrill’s.
Setting that difference aside, there are still some problems with our data. The original spreadsheet mixes individual donors and organizations, so we need to create an ID for each donor. Also, some donors made more than one donation or changed the amount, so we need to work out the total each donor actually gave.
# Data Clean Up
# Step 1: Create a donor ID (individual name or organization name)
sherrill_donations_clean <- sherrill %>%
mutate(
donor_id = case_when(
!is.na(first_name) & first_name != "" &
!is.na(last_name) & last_name != "" ~ str_trim(paste(first_name, mi, last_name, suffix)),
!is.na(non_ind_name) & non_ind_name != "" ~ str_trim(non_ind_name),
TRUE ~ "Unknown"
)
)
# Step 2: Group by donor_id and sum up all contribution_amount
sherrill_donor_totals <- sherrill_donations_clean %>%
group_by(donor_id) %>%
summarise(final_donation = sum(contribution_amount, na.rm = TRUE), .groups = "drop")
# Step 3: Merge the final total back to the original dataset
sherrill_donations_with_total <- sherrill_donations_clean %>%
left_join(sherrill_donor_totals, by = "donor_id")
# Step 4: Select relevant columns from the original data
sherrill_donor_details <- sherrill_donations_with_total %>%
select(
donor_id, final_donation, is_individual, date,
street, city, state, zip,
emp_name, emp_street, emp_city, emp_state, emp_zip,
occupation_name, contributor_type
)
# Step 5: Keep only the newest row for each donor based on date
# Convert date to proper Date type if it's not already
sherrill_donor_details <- sherrill_donor_details %>%
mutate(date = as.Date(date))
sherrill_latest_details <- sherrill_donor_details %>%
arrange(donor_id, desc(date)) %>%
group_by(donor_id) %>%
slice(1) %>% # Keep only the newest record
ungroup()
# Step 6: Double check all the data. For Sherrill's case, the zip format needs to be fixed
sherrill_latest_details <- sherrill_latest_details %>%
mutate(zip = ifelse(nchar(zip) == 4, paste0("0", zip), as.character(zip)))
# Repeat the same steps for the other candidate
ciattarelli_donations_clean <- ciattarelli %>%
mutate(
donor_id = case_when(
!is.na(first_name) & first_name != "" &
!is.na(last_name) & last_name != "" ~ str_trim(paste(first_name, mi, last_name, suffix)),
!is.na(non_ind_name) & non_ind_name != "" ~ str_trim(non_ind_name),
TRUE ~ "Unknown"
)
)
ciattarelli_donor_totals <- ciattarelli_donations_clean %>%
group_by(donor_id) %>%
summarise(final_donation = sum(contribution_amount, na.rm = TRUE), .groups = "drop")
ciattarelli_donations_with_total <- ciattarelli_donations_clean %>%
left_join(ciattarelli_donor_totals, by = "donor_id")
ciattarelli_donor_details <- ciattarelli_donations_with_total %>%
select(
donor_id, final_donation, is_individual, date,
street, city, state, zip,
emp_name, emp_street, emp_city, emp_state, emp_zip,
occupation_name, contributor_type
)
ciattarelli_donor_details <- ciattarelli_donor_details %>%
mutate(date = as.Date(date))
ciattarelli_latest_details <- ciattarelli_donor_details %>%
arrange(donor_id, desc(date)) %>%
group_by(donor_id) %>%
slice(1) %>% # Keep only the newest record
ungroup()
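The cleanup steps above are written out once per candidate. One way to avoid the duplication is to wrap them in a function. The sketch below is an assumption about how that could look, not the code used for the results in this story: `summarise_donors()` is a hypothetical name, it swaps `str_trim()` for `str_squish()` so that a blank middle initial does not leave a double space inside the donor ID, and the `toy` data is made up.

```r
library(dplyr)
library(stringr)

# Hypothetical helper: returns one row per donor, carrying the donor's
# summed contributions and the details from their most recent record.
summarise_donors <- function(df) {
  df |>
    mutate(
      donor_id = case_when(
        !is.na(first_name) & first_name != "" &
          !is.na(last_name) & last_name != "" ~
          str_squish(paste(first_name, mi, last_name, suffix)),
        !is.na(non_ind_name) & non_ind_name != "" ~ str_squish(non_ind_name),
        TRUE ~ "Unknown"
      )
    ) |>
    group_by(donor_id) |>
    mutate(final_donation = sum(contribution_amount, na.rm = TRUE)) |>
    arrange(desc(date), .by_group = TRUE) |>  # newest first; NA dates sort last
    slice(1) |>
    ungroup()
}

# Made-up input: one individual with two gifts, one organization with one.
toy <- tibble::tibble(
  first_name = c("ANN", "ANN", ""),
  mi = c("", "", ""),
  last_name = c("LEE", "LEE", ""),
  suffix = c("", "", ""),
  non_ind_name = c("", "", "EXAMPLE PAC"),
  contribution_amount = c(100, 250, 500),
  date = as.Date(c("2024-11-01", "2024-12-01", "2024-11-15"))
)

donors <- summarise_donors(toy)
donors
```

With a helper like this, each candidate's table would come from a single call, and any fix to the donor ID logic would apply to both.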
Q1: Who are the biggest donors to each candidate? How much
did they donate? (If we find connections between candidates and
executives from health care corporations, union members, etc., we can
investigate further.)
For candidate Mikie Sherrill:
sherrill_top_donations <- sherrill_latest_details %>%
arrange(desc(final_donation)) %>%
slice_head(n = 200)
print(sherrill_top_donations)
## # A tibble: 200 × 15
## donor_id final_donation is_individual date street city state zip
## <chr> <dbl> <chr> <date> <chr> <chr> <chr> <chr>
## 1 ADAM BUCHS… 5800 Y 2024-12-13 19 GR… MADI… NJ 07940
## 2 ALEXANDER … 5800 Y NA 4222 … WASH… DC 20011
## 3 ALFRED CLA… 5800 Y 2024-11-18 4440 … PALM… FL 33410
## 4 ALIX JENNI… 5800 Y 2024-11-17 6 EDG… MADI… NJ 07940
## 5 ALLEN BLUE 5800 Y 2024-12-30 750 N… WEST… CA 90069
## 6 AMY BUDETTI 5800 Y 2024-12-30 3 ELI… MONT… NJ 07043
## 7 AMY KELLEY 5800 Y 2024-11-17 351 N… SOUT… NJ 07079
## 8 ANA HERRER… 5800 Y 2024-11-21 8940 … PINE… FL 33156
## 9 ANDREW SIM… 5800 Y 2024-11-19 249 B… RIDG… NJ 07450
## 10 ANDY BERNDT 5800 Y 2024-11-15 89 DU… MAPL… NJ 07040
## # ℹ 190 more rows
## # ℹ 7 more variables: emp_name <chr>, emp_street <chr>, emp_city <chr>,
## # emp_state <chr>, emp_zip <int>, occupation_name <chr>,
## # contributor_type <chr>
As we can see, the top donors all gave the same amount, so it is hard to single out a “biggest” donor. According to the NJ Election Law Enforcement Commission, $5,800 is the contribution limit for an individual candidate. Among Sherrill’s 525 donors, 178 (34%) reached that limit.
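The count and share of donors at the limit can be computed directly from the per-donor table. This is a minimal sketch on made-up data (`toy_donors` is hypothetical); pointing it at `sherrill_latest_details` should give the figures quoted above, assuming the limit is $5,800.

```r
library(dplyr)

limit <- 5800  # per-candidate contribution limit cited in the text

# Hypothetical per-donor totals: two at the limit, two below it.
toy_donors <- tibble::tibble(
  donor_id = c("A", "B", "C", "D"),
  final_donation = c(5800, 5800, 250, 1000)
)

limit_summary <- toy_donors |>
  summarise(
    donors = n(),
    at_limit = sum(final_donation >= limit),
    share = at_limit / donors
  )

limit_summary
```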
We can also make a table of these donors:
And for candidate Jack Ciattarelli:
ciattarelli_top_donations <- ciattarelli_latest_details %>%
arrange(desc(final_donation)) %>%
slice_head(n = 200)
print(ciattarelli_top_donations)
## # A tibble: 200 × 15
## donor_id final_donation is_individual date street city state zip
## <chr> <dbl> <chr> <date> <chr> <chr> <chr> <chr>
## 1 NEW JERSEY … 5500000 N 2025-03-31 P.O. … TREN… NJ 0862…
## 2 ADAM KRAUS… 5800 Y 2024-10-27 2 TOP… MARL… NJ 07746
## 3 ADAM SCHRE… 5800 Y 2024-09-27 622 W… TEAN… NJ 07666
## 4 ADMORE AIR … 5800 N 2024-08-13 835 M… YONK… NY 10704
## 5 ALBERT P LEE 5800 Y 2025-02-18 2379 … MENL… CA 94025
## 6 ALEXANDER D… 5800 Y 2024-11-25 189 W… PENN… NJ 08534
## 7 ALFRED J GA… 5800 Y 2024-06-30 PO BO… BLAW… NJ 08504
## 8 ALISON GLA… 5800 Y 2024-12-09 269 C… WYCK… NJ 07481
## 9 ALLEN BIRD 5800 Y 2025-03-17 710 1… ARLI… VA 22202
## 10 AMIR GOLDM… 5800 Y 2024-09-30 325 S… MERI… PA 19066
## # ℹ 190 more rows
## # ℹ 7 more variables: emp_name <chr>, emp_street <chr>, emp_city <chr>,
## # emp_state <chr>, emp_zip <int>, occupation_name <chr>,
## # contributor_type <chr>
The top donor has drawn our attention: the NJ Election Law Enforcement Commission. The record shows that the Commission will provide $5.5 million in public matching funds, which is also the primary public fund cap, to Ciattarelli. In exchange, Ciattarelli must abide by the $8.7 million primary expenditure limit. Sherrill also qualifies for the maximum $5.5 million in public funds, but she hasn’t officially used it for her race.
Beyond that, 192 of Ciattarelli’s 2,120 donors (9%) reached the donation limit. So although Sherrill had significantly fewer donations and donors than Ciattarelli, a higher percentage of her donors gave the maximum.
We can also dig deeper into this topic. For example, we can check the
distribution of the donations each candidate received and see how many are
big donations and how many are grassroots donations (donations from
individuals of less than $200). This can lead to a bar chart.
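A rough sketch of that bucketing and bar chart follows. It works on made-up per-donor totals (`toy_donors` is hypothetical), and the three bucket labels are an assumption; it also classifies by each donor's summed total rather than by single gift, which is one of several reasonable choices.

```r
library(dplyr)
library(ggplot2)

# Hypothetical per-donor totals covering all three buckets.
toy_donors <- tibble::tibble(
  donor_id = paste0("donor_", 1:6),
  final_donation = c(25, 150, 199, 200, 2500, 5800)
)

bucket_levels <- c("Under $200", "$200 to $5,799", "$5,800 (limit)")

bucketed <- toy_donors |>
  mutate(
    size = case_when(
      final_donation < 200 ~ bucket_levels[1],
      final_donation < 5800 ~ bucket_levels[2],
      TRUE ~ bucket_levels[3]
    ),
    # An ordered factor keeps the bars in size order instead of alphabetical.
    size = factor(size, levels = bucket_levels)
  ) |>
  count(size, name = "donors")

# Build the chart object; call print(p) to draw it.
p <- ggplot(bucketed, aes(x = size, y = donors)) +
  geom_col() +
  labs(x = "Total given per donor", y = "Number of donors")
```

Running the same pipeline on each candidate's per-donor table and placing the charts side by side would show how much each campaign leans on small donors.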
Q2: Do certain candidates receive more donations from specific industries?
# I'm still working on this part!
Q3: Are out-of-state contributors making a significant
impact on the election? (For in-state donors, we can also use their
zip codes to look for patterns.)
For Mikie Sherrill:
donations_money_by_state_sherrill <- sherrill_latest_details %>%
group_by(state) %>%
summarise(Total = sum(final_donation), .groups = "drop") %>%
arrange(desc(Total))
print(donations_money_by_state_sherrill)
## # A tibble: 21 × 2
## state Total
## <chr> <dbl>
## 1 NJ 860496.
## 2 NY 155543.
## 3 CA 72306.
## 4 FL 58056.
## 5 DC 38300
## 6 MA 25718.
## 7 VA 22470.
## 8 MD 18109
## 9 PA 18047.
## 10 CT 13155.
## # ℹ 11 more rows
64% of the contributions came from within the state. Beyond
New Jersey, donors from 20 other states (counting Washington, D.C.) gave to
her. The largest out-of-state totals came from New York, California, Florida,
Washington, D.C., Massachusetts, and Virginia.
For Jack Ciattarelli:
donations_money_by_state_ciattarelli <- ciattarelli_latest_details %>%
group_by(state) %>%
summarise(Total = sum(final_donation), .groups = "drop") %>%
arrange(desc(Total))
print(donations_money_by_state_ciattarelli)
## # A tibble: 27 × 2
## state Total
## <chr> <dbl>
## 1 "NJ" 8309785.
## 2 "NY" 169537.
## 3 "FL" 151010.
## 4 "PA" 90612.
## 5 "" 54400
## 6 "CA" 16360
## 7 "IN" 8800
## 8 "MI" 8600
## 9 "CT" 6004.
## 10 "VA" 5800
## # ℹ 17 more rows
94% of the contributions came from New Jersey, a much higher
share than Sherrill’s, because Ciattarelli’s total includes the $5.5 million
in in-state public funds. However, 0.6% of the contributions could not be
traced to a state of origin. Besides New Jersey, 25 other states have donation
records. The largest out-of-state totals came from New York, Florida,
Pennsylvania, and California.
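Because the public funds are counted as a New Jersey contribution, it is worth checking how the in-state share changes without them. The arithmetic below uses the rounded totals printed in the table above, so the results are approximate.

```r
# Figures taken from the printed output above (rounded to the dollar).
total_all    <- 8862080  # all contributions to Ciattarelli
nj_all       <- 8309785  # contributions with state == "NJ"
unknown      <- 54400    # contributions with a blank state
public_funds <- 5500000  # public matching funds, recorded under NJ

# Share from New Jersey, public funds included.
share_nj <- nj_all / total_all

# Share with no state recorded.
share_unknown <- unknown / total_all

# Share from New Jersey after removing the public funds
# from both the numerator and the denominator.
share_nj_private <- (nj_all - public_funds) / (total_all - public_funds)

round(100 * c(share_nj, share_unknown, share_nj_private), 1)
```

On these figures, the in-state share of private contributions comes out at roughly 84%, still well above Sherrill's 64%.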
We can also check the number of donors for each state:
donations_amount_by_state_sherrill <- sherrill_latest_details %>%
group_by(state) %>%
summarise(Count = n(), .groups = "drop") %>%
arrange(desc(Count))
print(donations_amount_by_state_sherrill)
## # A tibble: 21 × 2
## state Count
## <chr> <int>
## 1 NJ 366
## 2 NY 44
## 3 CA 37
## 4 FL 13
## 5 DC 11
## 6 MA 9
## 7 MD 9
## 8 CT 7
## 9 VA 7
## 10 PA 6
## # ℹ 11 more rows
In the case of Sherrill, for example, the total from Connecticut was lower than the totals from Virginia and Pennsylvania, yet Connecticut had as many donors as Virginia (7) and more than Pennsylvania (6). This means the average donation from Connecticut was smaller.
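Dividing each state's total by its donor count gives the average amount per donor, which makes this comparison explicit. A minimal sketch using the Connecticut, Virginia, and Pennsylvania figures for Sherrill from the two tables above (totals rounded to the dollar):

```r
library(dplyr)

# Totals and donor counts copied from the printed tables for Sherrill.
by_state <- tibble::tibble(
  state  = c("VA", "PA", "CT"),
  total  = c(22470, 18047, 13155),
  donors = c(7, 6, 7)
) |>
  mutate(avg_per_donor = total / donors) |>
  arrange(desc(avg_per_donor))

by_state
```

In the full analysis, the same column could be produced by adding `sum(final_donation)` alongside `n()` in a single `summarise()` call per state.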
However, the number of out-of-state contributors in this data is
relatively small, which could skew the analysis.
While the number of donors is also an important indicator of a
candidate’s popularity, we won’t prioritize it for now.
For in-state donors, their zip codes are available in the dataset, so we can analyze the donations in the area corresponding to the zip code.
For Mikie Sherrill:
in_state_donations_sherrill <- sherrill_latest_details %>%
filter(state == "NJ") %>%
group_by(zip) %>%
summarise(total_donation = sum(final_donation, na.rm = TRUE)) %>%
arrange(desc(total_donation))
print(in_state_donations_sherrill)
## # A tibble: 87 × 2
## zip total_donation
## <chr> <dbl>
## 1 07043 111365.
## 2 07042 99222.
## 3 07960 89156.
## 4 07940 62848
## 5 07928 35270.
## 6 07052 27700
## 7 07040 24459
## 8 07006 23659
## 9 07920 22670.
## 10 07079 15127.
## # ℹ 77 more rows
Zip code 07043, the Upper Montclair area, has the most
donations. The donations she received are concentrated in northern New
Jersey. According to Ballotpedia, this is in line with her results
in the New Jersey county Democratic Party conventions.
For Jack Ciattarelli:
in_state_donations_ciattarelli <- ciattarelli_latest_details %>%
filter(state == "NJ") %>%
group_by(zip) %>%
summarise(total_donation = sum(final_donation, na.rm = TRUE)) %>%
arrange(desc(total_donation))
print(in_state_donations_ciattarelli)
## # A tibble: 400 × 2
## zip total_donation
## <chr> <dbl>
## 1 08625-0185 5500000
## 2 08540 66573.
## 3 08008 57283.
## 4 07760 55101.
## 5 07458 54998.
## 6 07960 51395.
## 7 07417 45995.
## 8 07430 43310.
## 9 07090 39897.
## 10 08844 35257.
## # ℹ 390 more rows
Zip code 08625-0185 and its $5.5 million come from the NJ Election Law Enforcement Commission, which doesn’t mean this area’s residents donated to him. We will remove it before drawing the map.
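One way to drop that row before mapping is to filter on the zip code itself. The sketch below uses the first three rows of the table above; filtering by zip is an assumption that works here because no private donor shares that ZIP+4 code in the output shown.

```r
library(dplyr)

# First three rows of the printed table (totals rounded to the dollar).
zip_totals <- tibble::tibble(
  zip = c("08625-0185", "08540", "08008"),
  total_donation = c(5500000, 66573, 57283)
)

# Remove the Commission's public-fund row, then re-rank.
private_zip_totals <- zip_totals |>
  filter(zip != "08625-0185") |>
  arrange(desc(total_donation))

private_zip_totals
```

A stricter alternative would be to exclude the Commission by donor name before summing by zip, so that any private donors in the same zip code are kept.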
Ciattarelli received the most donations from Princeton, NJ 08540. His in-state
contributions are spread over a much wider area than Sherrill’s, which
suggests strong support in central New Jersey.
Sources:
https://www.elec.nj.gov/pdffiles/press_releases/pr_2025/pr_05292025.pdf